It is well known that deep neural networks (DNNs) classify an input image by paying particular attention to certain specific pixels; a graphical representation of the attention paid to each pixel is called a saliency map. Saliency maps are used to check the validity of the basis of a classification decision: for example, if a DNN pays more attention to the background than to the subject of an image, the classification has no valid basis. Semantic perturbations can significantly change the saliency map. In this work, we propose the first verification method for attention robustness, i.e., the local robustness of saliency maps against combinations of semantic perturbations. Specifically, our method determines the range of perturbation parameters (e.g., brightness change) within which the difference between the actual saliency-map change and the expected saliency-map change stays below a given threshold. Our method is based on activation region traversal and focuses on the outermost robust boundary so that it scales to larger DNNs. Experimental results demonstrate that our method can show the extent to which a DNN classifies on the same basis regardless of semantic perturbations, and we report the performance of activation region traversal and the factors that affect it.
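As an illustration only (a brute-force sweep, not the paper's activation-region-traversal algorithm), the quantity being verified, how much a gradient-based saliency map changes under a brightness perturbation, can be sketched as follows; `model`, `image`, and the threshold are hypothetical placeholders assumed to follow standard PyTorch conventions.

```python
import torch

def saliency_map(model, x, label):
    # Per-pixel attention as the absolute input gradient of the class score.
    x = x.clone().requires_grad_(True)
    score = model(x.unsqueeze(0))[0, label]
    score.backward()
    return x.grad.abs().sum(dim=0)

def attention_robust_range(model, image, label, threshold, betas):
    """Return the brightness offsets for which the saliency map stays close to the original."""
    base = saliency_map(model, image, label)
    robust = []
    for beta in betas:                               # e.g. torch.linspace(-0.3, 0.3, 31)
        perturbed = (image + beta).clamp(0.0, 1.0)   # assumes inputs scaled to [0, 1]
        diff = (saliency_map(model, perturbed, label) - base).abs().mean()
        if diff.item() < threshold:
            robust.append(float(beta))
    return robust
```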
Distributed inference (DI) frameworks have gained traction as a technique for running cutting-edge deep machine learning (ML) on resource-constrained Internet-of-Things (IoT) devices in real-time applications. In DI, computational tasks are offloaded from IoT devices to edge servers over lossy IoT networks. In general, however, there is a communication-system-level trade-off between communication latency and reliability; thus, to obtain accurate DI results, a reliable but high-latency communication system must be used, which makes the end-to-end latency of DI non-negligible. This motivates us to improve the trade-off between communication latency and accuracy by means of ML techniques. Specifically, we propose a communication-oriented model tuning scheme (ComTune), which aims to achieve highly accurate DI over low-latency but unreliable communication links. The key idea of ComTune is to fine-tune the ML model while emulating the effect of unreliable communication links through the dropout technique. This allows the DI system to acquire robustness against unreliable communication links. Our ML experiments show that ComTune enables accurate prediction with low latency over lossy networks.
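A minimal sketch of the key idea, assuming the model is split into a device-side part and an edge-side part, with packet loss on the link modeled as element-wise dropout on the transmitted features (the authors' exact training recipe is not reproduced here):

```python
import torch.nn as nn

class SplitDI(nn.Module):
    def __init__(self, device_net: nn.Module, edge_net: nn.Module, loss_rate: float):
        super().__init__()
        self.device_net = device_net          # runs on the IoT device
        self.edge_net = edge_net              # runs on the edge server
        self.link = nn.Dropout(p=loss_rate)   # emulates the unreliable link during fine-tuning

    def forward(self, x):
        features = self.device_net(x)
        features = self.link(features)        # randomly zeroed features stand in for lost packets
        return self.edge_net(features)
```

During fine-tuning the dropout layer is active and the model learns to tolerate missing features; at deployment, the real lossy channel takes its place.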
This paper proposes a fully decentralized federated learning (FL) scheme for Internet-of-Everything (IoE) devices connected via multi-hop networks. Since FL algorithms hardly achieve convergence of the parameters of machine learning (ML) models, this paper focuses on the convergence of ML models in a function space. Noting that representative loss functions for ML tasks, e.g., mean squared error (MSE) and Kullback-Leibler (KL) divergence, are convex functionals, an algorithm that directly updates functions in the function space can converge to the optimal solution. The key concept of this paper is to tailor a consensus-based optimization algorithm that works in the function space and achieves the global optimum in a distributed manner. This paper first analyzes the convergence of the proposed algorithm in the function space, referred to as a meta-algorithm, and shows that spectral graph theory can be applied to the function space in a manner similar to numerical vectors. Then, consensus-based multi-hop federated distillation (CMFD) is developed for neural networks (NNs) to implement the meta-algorithm. CMFD leverages knowledge distillation to realize function aggregation among adjacent devices without parameter averaging. An advantage of CMFD is that it works even when different NN models are used among the distributed learners. Although CMFD does not perfectly reflect the behavior of the meta-algorithm, the discussion of the meta-algorithm's convergence properties promotes an intuitive understanding of CMFD, and simulation evaluations show that NN models converge with CMFD on several tasks. The simulation results also show that CMFD achieves higher accuracy than parameter aggregation for weakly connected networks, and that CMFD is more stable than the parameter aggregation method.
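The aggregation step could be sketched roughly as follows, with each device distilling toward the average prediction of its neighbours on a shared distillation set; the variable names and the MSE distillation loss are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cmfd_round(models, neighbours, distill_loader, lr=1e-3):
    """One communication round: models[i] is device i's NN, neighbours[i] its adjacency list."""
    optimizers = [torch.optim.SGD(m.parameters(), lr=lr) for m in models]
    for x, _ in distill_loader:                    # labels are not needed for aggregation
        with torch.no_grad():
            outputs = [m(x) for m in models]       # each device's current function values
        for i, model in enumerate(models):
            target = torch.stack([outputs[j] for j in neighbours[i]]).mean(dim=0)
            loss = F.mse_loss(model(x), target)    # pull f_i toward its neighbours' average
            optimizers[i].zero_grad()
            loss.backward()
            optimizers[i].step()
```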
Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either the positive or the negative class, depending on whether the rate of arms with expected reward of at least h is at least w, for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected rewards f(x) generated according to a Gaussian process prior. We develop a framework algorithm for the problem that admits various arm selection policies, and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than for the existing level set estimation algorithm, which must decide for every arm x whether f(x) is at least h. We also propose arm selection policies that depend on an estimated rate of arms with rewards of at least h, and show that they improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies on synthetic functions, and the rate-estimation version of FTSV is also the best performer on our real-world dataset.
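Because the precise definitions of FCB and FTSV are not reproduced here, the following is only a generic confidence-bound sketch of threshold-aware arm selection under a GP posterior, together with a naive estimate of the rate of arms above h:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def select_arm(X_arms, X_obs, y_obs, h, beta=2.0):
    """Pick the arm whose classification against threshold h is most ambiguous."""
    gp = GaussianProcessRegressor().fit(X_obs, y_obs)
    mu, sigma = gp.predict(X_arms, return_std=True)
    ucb, lcb = mu + beta * sigma, mu - beta * sigma
    ambiguity = np.minimum(ucb - h, h - lcb)   # negative when an arm is clearly above or below h
    next_arm = int(np.argmax(ambiguity))
    rate_estimate = float(np.mean(mu >= h))    # estimated rate of arms with reward >= h
    return next_arm, rate_estimate
```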
Microswimmers can acquire information on the surrounding fluid by sensing mechanical cues. They can then navigate in response to these signals. We analyse this navigation by combining deep reinforcement learning with direct numerical simulations to resolve the hydrodynamics. We study how local and non-local information can be used to train a swimmer to achieve particular swimming tasks in a non-uniform flow field, in particular a zig-zag shear flow. The swimming tasks are (1) learning how to swim in the vorticity direction, (2) the shear-gradient direction, and (3) the shear-flow direction. We find that access to lab-frame information on the swimmer's instantaneous orientation is all that is required to reach the optimal policy for tasks (1) and (2). However, information on both the translational and rotational velocities seems to be required to achieve task (3). Inspired by biological microorganisms, we also consider the case where the swimmer senses local information, i.e. surface hydrodynamic forces, together with a signal direction. This might correspond to gravity or, for micro-organisms with light sensors, a light source. In this case, we show that the swimmer can reach a comparable level of performance to a swimmer with access to lab-frame variables. We also analyse the role of different swimming modes, i.e. pusher, puller, and neutral swimmers.
Understanding the dynamics of a system is important in many scientific and engineering domains. This problem can be approached by learning state transition rules from observations using machine learning techniques. Such observed time-series data often consist of sequences of many continuous variables with noise and ambiguity, but we often need rules of dynamics that can be modeled with a few essential variables. In this work, we propose a method for extracting a small number of essential hidden variables from high-dimensional time-series data and for learning state transition rules between these hidden variables. The proposed method is based on the Restricted Boltzmann Machine (RBM), which treats observable data in the visible layer and latent features in the hidden layer. However, real-world data, such as video and audio, include both discrete and continuous variables, and these variables have temporal relationships. Therefore, we propose the Recurrent Temporal Gaussian-Bernoulli Restricted Boltzmann Machine (RTGB-RBM), which combines the Gaussian-Bernoulli Restricted Boltzmann Machine (GB-RBM), to handle continuous visible variables, and the Recurrent Temporal Restricted Boltzmann Machine (RT-RBM), to capture time dependence between discrete hidden variables. We also propose a rule-based method that extracts essential information as hidden variables and represents state transition rules in interpretable form. We conduct experiments on the Bouncing Ball and Moving MNIST datasets to evaluate our proposed method. Experimental results show that our method can learn the dynamics of those physical systems as state transition rules between hidden variables and can predict unobserved future states from observed state transitions.
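A minimal sketch of the two conditionals such a model combines, assuming unit-variance Gaussian visibles and a single recurrent hidden-to-hidden bias (training, e.g. contrastive divergence through time, is omitted):

```python
import torch

class RTGBRBM:
    def __init__(self, n_vis, n_hid):
        self.W = torch.randn(n_vis, n_hid) * 0.01   # visible-hidden weights
        self.U = torch.randn(n_hid, n_hid) * 0.01   # hidden(t-1) -> hidden(t) weights
        self.a = torch.zeros(n_vis)                 # Gaussian visible biases (means)
        self.b = torch.zeros(n_hid)                 # Bernoulli hidden biases

    def hidden_given(self, v, h_prev):
        # Continuous visibles plus a recurrent bias from the previous hidden state.
        return torch.sigmoid(v @ self.W + h_prev @ self.U + self.b)

    def visible_given(self, h):
        # Unit-variance Gaussian visibles with mean a + W h.
        mean = self.a + h @ self.W.t()
        return mean + torch.randn_like(mean)
```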
We propose a novel backpropagation algorithm for training spiking neural networks (SNNs) that encodes information in the relative multiple spike timing of individual neurons without single-spike restrictions. The proposed algorithm inherits the advantages of conventional timing-based methods in that it computes accurate gradients with respect to spike timing, which promotes ideal temporal coding. Unlike conventional methods where each neuron fires at most once, the proposed algorithm allows each neuron to fire multiple times. This extension naturally improves the computational capacity of SNNs. Our SNN model outperformed comparable SNN models and achieved accuracy as high as that of non-convolutional artificial neural networks. The spike count of our networks varied depending on the time constants of the postsynaptic current and the membrane potential. Moreover, we found that there exists an optimal time constant that maximizes the test accuracy, which is not seen in conventional SNNs with single-spike restrictions under time-to-first-spike (TTFS) coding. This result demonstrates the computational properties of SNNs that biologically encode information in the multi-spike timing of individual neurons. Our code will be made publicly available.
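For illustration only (not the paper's gradient derivation), the multi-spike setting the proposed backpropagation differentiates through can be sketched as a current-based LIF neuron that is allowed to fire repeatedly, with a soft reset after each spike:

```python
def lif_multi_spike(input_spikes, w, tau_syn=5.0, tau_mem=10.0, v_th=1.0, dt=0.1, T=100.0):
    """input_spikes: list of (time, presynaptic index); w: sequence of synaptic weights."""
    i_syn, v, out_times = 0.0, 0.0, []
    for step in range(int(T / dt)):
        t = step * dt
        i_syn += sum(w[j] for s, j in input_spikes if abs(s - t) < dt / 2)  # arriving spikes
        i_syn -= dt * i_syn / tau_syn           # postsynaptic current decay
        v += dt * (-v / tau_mem + i_syn)        # membrane potential dynamics
        if v >= v_th:                           # the neuron may fire multiple times
            out_times.append(t)
            v -= v_th                           # soft reset keeps the surplus potential
    return out_times
```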
Natural language inference (NLI) and semantic textual similarity (STS) are widely used benchmark tasks for the compositional evaluation of pre-trained language models. Despite growing interest in linguistic universals, most NLI/STS studies have focused almost exclusively on English. In particular, no multilingual NLI/STS dataset is available in Japanese, which is typologically different from English and can shed light on currently controversial behaviors of language models, such as their sensitivity to word order and case particles. Against this background, we introduce JSICK, a Japanese NLI/STS dataset manually translated from the English dataset SICK. We also present a stress-test dataset for compositional inference, created by transforming the syntactic structures of sentences in JSICK, to investigate whether language models are sensitive to word order and case particles. We conduct baseline experiments on different pre-trained language models and compare the performance of multilingual models when applied to Japanese and other languages. The results of the stress-test experiments suggest that current pre-trained language models are insensitive to word order and case marking.
Transient noise appearing in the data from gravitational-wave detectors frequently causes problems, such as detector instability and the overlapping or mimicking of gravitational-wave signals. Because transient noise is considered to be associated with the environment and the instruments, its classification would help to understand its origin and to improve detector performance. In a previous study, an architecture for classifying transient noise using time-frequency 2D images (spectrograms) was proposed, which applied unsupervised deep learning combining a variational autoencoder and invariant information clustering. The proposed unsupervised-learning architecture is applied to the Gravity Spy dataset, which consists of transient noise from the Advanced Laser Interferometer Gravitational-wave Observatory (Advanced LIGO) together with its associated metadata, to discuss its potential for online or offline data analysis. In this study, focusing on the Gravity Spy dataset, we investigate and report on the training process of the unsupervised-learning architecture of the previous study.
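The invariant-information-clustering component of that architecture amounts to maximizing the mutual information between the cluster assignments of a spectrogram and of an augmented copy; a sketch of that objective (the encoder and the augmentation are assumed, not shown) is:

```python
import torch

def iic_loss(p1, p2, eps=1e-8):
    """p1, p2: (batch, n_clusters) soft cluster assignments of two views of the same sample."""
    joint = (p1.unsqueeze(2) * p2.unsqueeze(1)).mean(dim=0)   # joint distribution P(c, c')
    joint = ((joint + joint.t()) / 2).clamp(min=eps)          # symmetrize and avoid log(0)
    marg_1 = joint.sum(dim=1, keepdim=True)                   # marginal over the first view
    marg_2 = joint.sum(dim=0, keepdim=True)                   # marginal over the second view
    return -(joint * (joint.log() - marg_1.log() - marg_2.log())).sum()  # negative mutual information
```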
Quantum computing has been moving from the theoretical phase to the practical phase, presenting daunting challenges in implementing physical qubits, which are subject to noise from the surrounding environment. Such quantum noise is ubiquitous in quantum devices and degrades quantum computing models, so its correction and mitigation techniques have been studied extensively. But does quantum noise always bring disadvantages? We address this question by proposing a framework called quantum-noise-induced reservoir computing and showing that some abstract quantum noise models can induce useful information-processing capabilities for temporal input data. We demonstrate this capability on several typical benchmarks and investigate the information-processing capacity to clarify the framework's processing mechanism and memory profile. We verify our perspective by implementing the framework on a number of IBM quantum processors and obtaining similar characteristic memory profiles through model analyses. Surprisingly, the information-processing capacity increased with the quantum devices' noise levels and error rates. Our study opens a new path for diverting useful information from the noise of quantum computers into more sophisticated information processors.
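Abstracting the noisy quantum dynamics into a hypothetical `reservoir_step` callable (assumed to keep its own internal state, e.g. by driving a circuit on the device and returning measured observables), the reservoir-computing readout described above reduces to fitting a linear model on those features:

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_readout(inputs, targets, reservoir_step, washout=50):
    """inputs: (T,) temporal signal; reservoir_step(u_t) -> fixed-size feature vector."""
    states = np.stack([reservoir_step(u) for u in inputs])   # measured observables per step
    readout = Ridge(alpha=1e-3).fit(states[washout:], targets[washout:])
    return readout, readout.score(states[washout:], targets[washout:])
```

Only the linear readout is trained; the noisy reservoir itself is left untouched, which is what allows hardware noise to contribute useful memory.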